skip to main content
US FlagAn official website of the United States government
dot gov icon
Official websites use .gov
A .gov website belongs to an official government organization in the United States.
https lock icon
Secure .gov websites use HTTPS
A lock ( lock ) or https:// means you've safely connected to the .gov website. Share sensitive information only on official, secure websites.


Search for: All records

Creators/Authors contains: "Tancik, Matthew"

Note: When clicking on a Digital Object Identifier (DOI) number, you will be taken to an external site maintained by the publisher. Some full text articles may not yet be available without a charge during the embargo (administrative interval).
What is a DOI Number?

Some links on this page may take you to non-federal websites. Their policies may differ from this site.

  1. gsplat is an open-source library designed for training and developing Gaussian Splat- ting methods. It features a front-end with Python bindings compatible with the Py- Torch library and a back-end with highly optimized CUDA kernels. gsplat o↵ers nu- merous features that enhance the optimization of Gaussian Splatting models, which in- clude optimization improvements for speed, memory, and convergence times. Experimen- tal results demonstrate that gsplat achieves up to 10% less training time and 4⇥ less memory than the original Kerbl et al. (2023) implementation. Utilized in several re- search projects, gsplat is actively maintained on GitHub. Source code is available at https://github.com/nerfstudio-project/gsplat under Apache License 2.0. We wel- come contributions from the open-source community. 
    more » « less
    Free, publicly-accessible full text available February 1, 2026
  2. Grouping is inherently ambiguous due to the multiple levels of granularity in which one can decompose a scene -- should the wheels of an excavator be considered separate or part of the whole? We present Group Anything with Radiance Fields (GARField), an approach for decomposing 3D scenes into a hierarchy of semantically meaningful groups from posed image inputs. To do this we embrace group ambiguity through physical scale: by optimizing a scale-conditioned 3D affinity feature field, a point in the world can belong to different groups of different sizes. We optimize this field from a set of 2D masks provided by Segment Anything (SAM) in a way that respects coarse-to-fine hierarchy, using scale to consistently fuse conflicting masks from different viewpoints. From this field we can derive a hierarchy of possible groupings via automatic tree construction or user interaction. We evaluate GARField on a variety of in-the-wild scenes and find it effectively extracts groups at many levels: clusters of objects, objects, and various subparts. GARField inherently represents multi-view consistent groupings and produces higher fidelity groups than the input SAM masks. GARField's hierarchical grouping could have exciting downstream applications such as 3D asset extraction or dynamic scene understanding. See the project website at https://www.garfield.studio/ 
    more » « less
  3. null (Ed.)
    Widely used in news, business, and educational media, infographics are handcrafted to effectively communicate messages about complex and often abstract topics including `ways to conserve the environment' and `coronavirus prevention'. The computational understanding of infographics required for future applications like automatic captioning, summarization, search, and question-answering, will depend on being able to parse the visual and textual elements contained within. However, being composed of stylistically and semantically diverse visual and textual elements, infographics pose challenges for current A.I. systems. While automatic text extraction works reasonably well on infographics, standard object detection algorithms fail to identify the stand-alone visual elements in infographics that we refer to as `icons'. In this paper, we propose a novel approach to train an object detector using synthetically-generated data, and show that it succeeds at generalizing to detecting icons within in-the-wild infographics. We further pair our icon detection approach with an icon classifier and a state-of-the-art text detector to demonstrate three demo applications: topic prediction, multi-modal summarization, and multi-modal search. Parsing the visual and textual elements within infographics provides us with the first steps towards automatic infographic understanding. 
    more » « less
  4. Imaging through highly scattering media holds many opportunities in underwater and biomedical imaging. Here we leverage a single photon avalanche diode (SPAD) camera, and experimentally demonstrate an imaging pipeline to see through turbid water in optical reflection mode. 
    more » « less